Abstract

Accurate wind power forecasts depend on reliable wind speed forecasts. Numerical
Weather Predictions (NWPs) require huge amounts of computing time, but still have
rather low spatial and temporal resolution. Stochastic wind speed forecasts, in contrast,
perform well at rather high temporal resolution. They consume comparably little
computing resources and return reliable forecasts, provided forecasting horizons are not too long.

In the recent literature, spatial interdependence is increasingly taken into consideration. 

In this paper we propose a new and quite flexible multivariate model that accounts for
neighbouring weather stations’ information and thus exploits spatial data at a high
resolution. The model is applied to forecasting horizons of up to one day and is capable
of handling a high resolution temporal structure. We use a periodic vector autoregressive
model with seasonal lags to account for the interaction of the explanatory variables.
Periodicity is modelled by cubic B-splines. Due to the model’s flexibility,
the number of explanatory variables becomes huge. Therefore, we utilize time-saving
shrinkage methods like lasso and elastic net for estimation. In particular, we apply a relatively
new iteratively re-weighted lasso and elastic net that also incorporates
heteroscedasticity. We compare our model to several benchmarks. The out-of-sample
forecasting results show that exploiting spatial information increases the forecasting
accuracy tremendously in comparison to models in use so far.

Keywords: Wind speed; Forecasting; Periodic vector autoregressive model; Periodic B-splines;
Iteratively re-weighted lasso method

Addresses: 

Corresponding Author: Daniel Ambach, European University Viadrina, Chair of Quantitative
Methods and Statistics, Post Box 1786, 15207 Frankfurt (Oder), Germany, Tel. +49 (0)335 5534 2983,
Fax +49 (0)335 5534 2233, ambach@europa-uni.de.

Carsten Croonenbroeck, European University Viadrina, Chair of Economics and Economic Theory
(Macroeconomics), Post Box 1786, 15207 Frankfurt (Oder), Germany, Tel. +49 (0)335 5534 2701, Fax
+49 (0)335 5534 72701, croonenbroeck@europa-uni.de.



1 Introduction 


The progressing energy turnaround in Europe has one designated goal: reducing the dependence
on fossil energy and thus increasing the fraction of energy production that is obtained
from renewable sources. Other renewables like solar power or geothermal power aside, wind
power is the most successful one, in terms of installed capacity as well as in the aggregated
amount of power production. In contrast to fossil power, however, wind power production is
erratic and non-deterministic. Thus, accurate predictions are crucial for efficient market clearing
as well as network dispatching, as Soman et al. (2010) and Croonenbroeck and Dahl (2014)
point out. Short- to medium-term wind power forecasting (up to one day) is a wide field of
research. These predictions are usually carried out by stochastic approaches, as discussed by
Wu and Hong (2007).

Wind power prediction depends essentially on wind speed predictions. However, wind power
forecasting is a task of its own, since power production depends on the turbine type, among
other things, as Burton et al. (2011) indicate. Wind speed prediction models have been under
constant development and improvement for at least the past decade. Early work was done
by Haslett and Raftery (1989), who propose a long-memory autoregressive moving average
model for wind speed. Ewing et al. (2006) investigate heteroscedasticity of high frequency
data (15-minute observation frequency). Accordingly, Koopman et al. (2007) and Taylor et al. (2009)
apply an ARFIMA-GARCH model (autoregressive fractionally integrated moving average with
generalized autoregressive conditional heteroscedasticity) with a seasonal component in the
explanatory variables.

Instead of univariate processes, more recently, researchers exploit spatial information of the
data, as weather stations are available in close proximity to each other in many cases. ? develop
a spatial regime-switching model. Furthermore, Zhu et al. (2014) provide an extended
regime-switching approach. Saltyte Benth and Saltyte (2011) model the spatial dependence
of the daily wind speed by a Gaussian random field. Aguera-Perez et al. (2013) and
Santos-Alamillos et al. (2014) investigate the spatial structure of the data as well.

In this paper we propose a very general and flexible model that uses spatial wind speed,
wind direction, temperature and air pressure data. The interaction of the data is modelled
by a periodic vector autoregressive model (SVAR) with explanatory variables (referred to as
“X”), which are, in our case, just several periodic regressors. Contrary to classical Fourier
modeling of the periodicity, we use more flexible B-splines. The heteroscedastic variance is
modelled by a threshold autoregressive conditional heteroscedasticity (TARCH) model, also
with explanatory variables (which are again, in our case, periodic regressors). In the end, our
model is an SVARX-TARCHX model with adaptive lag selection. Due to the huge parameter
space, classical maximum likelihood (ML) estimation would consume a lot of computing time.
Instead, we use the popular shrinkage methodology, applied by means of the elastic net (see ?)
and the lasso (least absolute shrinkage and selection operator) method (see Tibshirani, 1996). In
the context of autoregressive and time series models, Ren and Zhang (2010), ? and ? apply
lasso type and elastic net methods. Evans et al. (2014) use the lasso and several other empirical
models to enhance the forecasting performance of a wind farm. The lasso and the elastic net
are efficient ways to perform model selection and model fitting in one step. These algorithms
operate quite fast and do not require any distributional assumption. Recently, Ziel et al. (2015)
proposed an iteratively re-weighted lasso method which incorporates heteroscedasticity within
the variance part.

We apply our model to a data set that consists of seven weather stations in Germany. The
stations record weather data at a frequency of ten minutes. Per station, we have three full
sample years, i.e. roughly 158,000 observations. With this data set, we calculate forecasts
from our model and from several benchmark models, i.e. the persistence model, an AR, a VAR
model and an ARFIMA-APARCH model (ARFIMA with asymmetric power generalized
autoregressive conditional heteroscedasticity), as applied by Ambach and Schmid (2014). The novel
multivariate approach is estimated by two different procedures, the lasso method
and the elastic net. We compare all models according to their forecasting accuracy and
recommend the best modeling approach.

The article is structured as follows. In Section 2, we briefly describe the data set and some
of its general properties. Furthermore, the univariate and the benchmark models are
described and the novel multivariate approach is introduced. Section 3 describes the in-sample
results of our new modeling approach. The out-of-sample results are provided in Section 4.
Finally, Section 5 concludes.


2 Wind speed data and model description 

[Figure caption fragment: "... which provide 10 min data."]

The spatial area of the investigated wind speed data is shown in Figure 1. In this article we focus on
the most central station, Lindenberg [5]. We use Müncheberg [2] as a reference station and
also use its neighbours for spatial information. The stations are situated in Eastern Germany in a region
of rural plains, which is well suited for wind parks. The data is measured by the “Deutscher
Wetterdienst” (DWD) and covers January 2009 to December 2011. For model fitting, a
time frame of two and a half years is used and the remaining months (July 2011 to December
2011) are used for out-of-sample forecasts. The wind speed $(W_{m,t})_{m \in \{1,\dots,M\},\, t \in \{1,\dots,T\}}$ is measured
in m/s at a 10-minute interval for a station $m$ at time $t$. Figure 2 shows the wind speed data,
the corresponding histogram and the autocorrelation function (ACF) for stations Lindenberg
and Müncheberg. The observed wind speed possesses strong (conditional) volatility and the
histogram of the wind speed shows a non-negative and positively skewed distribution. Moreover,
we observe the presence of autocorrelation, which is related to the high frequency of the wind
speed data. The autocorrelation function shows periodic characteristics. To investigate the
periods, we calculate the smoothed periodogram, which is given in Figure 3. The peaks in this
figure indicate periodic behavior at several frequencies, which correspond to diurnal
$s_1 = 1/0.00694 \approx 144$ and annual $s_2 = 1/0.000019 \approx 52{,}560$ periods (note that at a data frequency
of 10 minutes, there are 144 obs./day and 52,560 obs./year). We observe autocorrelation,
periodic behavior and heteroscedasticity, just as Ewing et al. (2006), Taylor et al. (2009) and ?.
Since wind speed is a spatial phenomenon, it is reasonable to include the
available and relevant information of neighbouring measurement stations. We illustrate
this idea with an example. If the wind is blowing from the north to the south and we observe
this at station Müncheberg, we are able to use this information for the station
Lindenberg (cf. Figure 1). If the wind comes from the south, we can consider the observations
from Cottbus. The investigated area of our data set contains seven different stations and
Lindenberg is the midpoint.
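
The period lengths quoted above follow from simple arithmetic on the sampling rate. The following sketch converts between periodogram frequency (in cycles per observation) and period length for 10-minute data. The function names are illustrative and not taken from the paper, and a 365-day year is assumed.

```python
OBS_PER_HOUR = 6  # one observation every 10 minutes


def observations_per_day() -> int:
    """Number of 10-minute observations in one day."""
    return 24 * OBS_PER_HOUR


def observations_per_year(days: int = 365) -> int:
    """Number of 10-minute observations in one year (assumes 365 days)."""
    return days * observations_per_day()


def frequency_to_period(frequency: float) -> float:
    """Period length in observations for a peak at the given frequency."""
    return 1.0 / frequency


if __name__ == "__main__":
    print(observations_per_day())                # diurnal period
    print(observations_per_year())               # annual period
    print(round(frequency_to_period(0.00694)))   # rounded diurnal peak
```

The annual frequency reported in the text (0.000019) is rounded, so its reciprocal only approximately reproduces the annual period.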

Figure 2. Wind speed data recorded at a 10-minute frequency, corresponding histogram and plot of the
autocorrelation function.

Figure 4 presents the pairwise correlation of all variables of all stations for the entire in-sample
time frame. Blue denotes positive correlation, red denotes negative correlation, and darker colors
indicate stronger correlation. Crossed boxes represent insignificant correlation coefficients.
Aside from six exceptions, all correlation coefficients are significant, but some are
close to zero. Müncheberg’s air pressure is barely correlated with any of the other variables. Each
station’s temperature data is strongly positively correlated with every other station’s temperature
data. The same holds for the wind direction. The correlation between most groups of different
types of variables is weak. Temperature and wind speed are weakly negatively correlated, while wind
speed and air pressure are somewhat more strongly positively correlated. Finally, we observe
a strong positive correlation between the wind direction and the wind speed, which will be
captured by our new multivariate model. Figure 5 depicts the wind rose for our target
station Lindenberg (left panel). Wind roses show the densities of wind speeds, dependent on
the wind direction. As the wind speed densities are rather large for eastern, north-eastern and
south-eastern directions, we conclude that Lindenberg is mostly affected by winds coming from
directions at which Baruth, Doberlug-Kirchhain and possibly Berlin-Schönefeld are located.
Accordingly, Figure 5 shows the wind rose for station Müncheberg on the right panel.


2.1 Benchmark models 

Figure 3. Estimation of the spectral density (panels: Spectrum Lindenberg, Spectrum Muencheberg).

Univariate wind speed models are based on a decomposition of the time series into a
time-dependent intercept, a trend component and a seasonal part. A general model can be written as

$$ W_t = \text{Intercept}_t + \text{Trend}_t + \text{Seasonal}_t + \epsilon_t, \tag{1} $$

where $W_t$ represents the wind speed and $\epsilon_t$ is the residual process with zero mean and time
dependent variance $\sigma_t^2$ for time $t$. One very general model of this type is the ARFIMA-APARCH
approach with periodic regressors as proposed by Ambach and Schmid (2014), which nests
various other models formerly proposed. As we contribute a novel multivariate model, we
use the ARFIMA-APARCH merely as a competitive benchmark from the class of univariate
models. As in the original motivation of this approach, we include periodic B-splines instead
of Fourier series to capture periodicity/seasonality. The intercept and the seasonal components
are modelled by

$$ \text{Intercept}_t + \text{Seasonal}_t = \vartheta_0 + \sum_{i_1=2}^{k_1} \vartheta_{i_1} f^{s_1}_{i_1}(t) + \sum_{i_2=2}^{k_2} \vartheta_{i_2} f^{s_2}_{i_2}(t), \tag{2} $$

where $f^{s_1}_{i_1}$ and $f^{s_2}_{i_2}$ are up to $k_1$ and $k_2$ time dependent periodic functions. The residual
process $\epsilon_t$ is stationary and follows an ARFIMA-APARCH process. Thus, it is given by

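$$ \epsilon_t = \phi(B)(1-B)^d X_t = \theta(B) Z_t, $$

$$ Z_t = h_t \eta_t \quad \text{where } \eta_t \sim \mathrm{WN}(0, 1), \tag{3} $$

$$ h_t^{\delta} = \alpha_0 + \sum_{l=1}^{Q} \alpha_l \left( |Z_{t-l}| - \gamma_l Z_{t-l} \right)^{\delta} + \sum_{l=1}^{P} \beta_l h_{t-l}^{\delta}, $$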

where $d \in (-0.5, 0.5)$ is the differencing parameter and $B$ is the backward shift operator with
$B^u X_t = X_{t-u}$. The ARMA($p$,$q$) polynomials $\phi$ and $\theta$ are $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ and
$\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q$ and have no common factors. The roots of $\phi$ and $\theta$ must lie
outside the unit circle. ? point out that instead of the conditional variance, the conditional
standard deviation with power $\delta$ follows an asymmetric power ARCH (APARCH) process. The
asymmetry parameter is $\gamma_l \in [-1, 1]$ for $l = 1, \dots, Q$. ? establish the existence of a stationary
solution for the APARCH process under the condition
$\sum_{l=1}^{Q} \alpha_l E(|z| - \gamma_l z)^{\delta} + \sum_{l=1}^{P} \beta_l < 1$.
The parameters of this ARFIMA-APARCH model are usually estimated by means of a quasi
maximum likelihood (QML) estimation procedure, which can be slow, requires a strict
distributional assumption and is prone to the usual problems of numerical optimization, e.g. identification
of the global maximum.

Figure 4. Plot of pairwise correlation for all dependent and independent variables and all stations, using plain
sensor data.

Figure 5. Wind rose, where the wind speed frequencies are plotted by wind direction for station Lindenberg
(left) and station Müncheberg (right).

Ambach and Schmid (2014) describe seasonality modeling by means of periodic B-splines,
which are becoming more and more accepted in the literature. Thapar et al. (2011), Bazilevs
et al. (2012) and Le Guyader et al. (2014) show the power of this approach. The fundamental
basics of the spline functions are defined by ? and ?. In the application, we define our B-splines
by a set of equidistant knots such that each function oscillates once per day. Also, we use a
second set of functions to capture annual periodicity. All B-splines are twice continuously
differentiable. Figure 6 depicts our diurnal functions.
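
To make the construction more concrete, the following is a minimal sketch of a periodic cubic B-spline basis on equidistant knots. It is one possible construction under stated assumptions, not necessarily the authors' implementation: the basis is obtained by shifting the cardinal cubic B-spline and wrapping it around the period, and the number of basis functions (`n_basis`) is a hypothetical choice. The resulting functions are twice continuously differentiable and sum to one at every point in time.

```python
from typing import List


def cardinal_cubic_bspline(u: float) -> float:
    """Cubic B-spline on the integer knots 0, 1, 2, 3, 4 (zero outside [0, 4))."""
    if 0.0 <= u < 1.0:
        return u ** 3 / 6.0
    if 1.0 <= u < 2.0:
        return (-3.0 * u ** 3 + 12.0 * u ** 2 - 12.0 * u + 4.0) / 6.0
    if 2.0 <= u < 3.0:
        return (3.0 * u ** 3 - 24.0 * u ** 2 + 60.0 * u - 44.0) / 6.0
    if 3.0 <= u < 4.0:
        return (4.0 - u) ** 3 / 6.0
    return 0.0


def periodic_bspline_basis(t: float, period: float, n_basis: int) -> List[float]:
    """Evaluate n_basis periodic cubic B-splines with equidistant knots at time t."""
    if n_basis < 4:
        raise ValueError("at least 4 basis functions are needed for this construction")
    spacing = period / n_basis      # distance between neighbouring knots
    position = t / spacing          # time measured in knot spacings
    # Shifting by i and wrapping modulo n_basis makes each function periodic.
    return [cardinal_cubic_bspline((position - i) % n_basis) for i in range(n_basis)]


if __name__ == "__main__":
    # Diurnal basis for 10-minute data (144 observations per day), 6 functions assumed.
    values = periodic_bspline_basis(t=30.0, period=144.0, n_basis=6)
    print(values, sum(values))
```

Because the functions sum to one, using all of them together with an intercept would be collinear. Dropping one function is a common remedy, and the sums starting at index 2 in the model equations are consistent with that, although this reading is an assumption.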

Figure 6. Diurnal cubic B-splines.

Aside from the nested ARFIMA-APARCH model for $\epsilon_t$, we consider a simple VAR($p$) (vector
autoregressive) and an AR($p$) process. The vector contains wind speed, but also several other
variables, which are described in Section 2.2. Finally, we investigate the usual persistence or
“naive” benchmark (see ?). The naive predictor uses the current value as a forecast for future
values, $\hat{W}_{t+k} = W_t$.
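
The persistence benchmark is simple enough to state in a few lines. The sketch below is illustrative only; the function name and the list-based data layout are assumptions.

```python
from typing import List, Sequence


def persistence_forecast(series: Sequence[float], origin: int, max_horizon: int) -> List[float]:
    """Naive forecast: repeat the value observed at the forecast origin for every horizon."""
    current = series[origin]
    return [current for _ in range(max_horizon)]


if __name__ == "__main__":
    wind_speed = [3.1, 4.0, 5.2, 4.8]  # made-up values in m/s
    # Forecast 144 steps (one day of 10-minute data) from index 1.
    forecast = persistence_forecast(wind_speed, origin=1, max_horizon=144)
    print(forecast[:3])
```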


2.2 Novel multivariate wind speed prediction model 

In contrast to the established univariate modeling by, e.g., the above-mentioned
ARFIMA-APARCH process, we introduce a multivariate model. Basically, our model is a VAR, but the
mean is enriched by seasonal functions and the conditional variance of the residual process
is modelled by a threshold autoregressive conditional heteroscedasticity (TARCH) process. The mean part as well as the
variance part may contain external periodic regressors (X), so the model can be described as
an SVARX-TARCHX approach.

We decompose our wind direction information into an east-west component, given by $\sin(az_{m,t})$,
and a north-south component, given by $\cos(az_{m,t})$, where $az_{m,t}$ is the azimuth, as measured
at station $m$ for time $t$. The SVARX part captures wind speed information, but air
pressure information $AP_{m,t}$ and temperature $C_{m,t}$, measured in degrees Celsius, are incorporated as well.
For stations $m = 1, \dots, M$ and time points $t = 1, \dots, T$, the dependent vector consists of
observations for 5 variables, i.e. $Y_t \in \mathbb{R}^{M \cdot 5}$, which is

$$ Y_t = \left( W_{1,t}, \dots, W_{M,t}, \sin(az_{1,t}), \dots, \sin(az_{M,t}), \cos(az_{1,t}), \dots, \cos(az_{M,t}), AP_{1,t}, \dots, AP_{M,t}, C_{1,t}, \dots, C_{M,t} \right)'. \tag{4} $$
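
As an illustration of how the dependent vector can be assembled for one point in time, consider the following sketch. It assumes that the azimuth is given in degrees, measured clockwise from north, so that the sine yields the east-west and the cosine the north-south component. The function name and argument layout are hypothetical.

```python
import math
from typing import List, Sequence


def build_dependent_vector(
    speed: Sequence[float],
    azimuth_deg: Sequence[float],
    pressure: Sequence[float],
    temperature: Sequence[float],
) -> List[float]:
    """Stack the 5 variables of all M stations into one vector of length 5 * M."""
    m = len(speed)
    if not (len(azimuth_deg) == len(pressure) == len(temperature) == m):
        raise ValueError("all inputs must contain one value per station")
    radians = [math.radians(a) for a in azimuth_deg]  # assumption: input in degrees
    return (
        list(speed)
        + [math.sin(a) for a in radians]   # east-west component
        + [math.cos(a) for a in radians]   # north-south component
        + list(pressure)
        + list(temperature)
    )


if __name__ == "__main__":
    y_t = build_dependent_vector(
        speed=[5.0, 6.0], azimuth_deg=[90.0, 180.0],
        pressure=[1013.0, 1012.0], temperature=[12.0, 11.5],
    )
    print(len(y_t))
```

Encoding the direction by sine and cosine avoids the artificial jump between 359 and 0 degrees that the raw azimuth would introduce.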






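The model accounts for an overall relationship between speed and direction variables across
all sites. The multivariate dependent variables in $Y_t$ are described by a vector autoregressive
model for the mean part given by

$$ Y_t = \vartheta_t + \sum_{j=1}^{J} \Phi_{t,j} Y_{t-j} + \epsilon_t, \tag{5} $$

$$ \epsilon_t = \sigma_t \eta_t, \tag{6} $$

$$ \vartheta_t = \vartheta_0 + \sum_{i_1=2}^{k_1} \vartheta_{i_1} f^{s_1}_{i_1}(t) + \sum_{i_2=2}^{k_2} \vartheta_{i_2} f^{s_2}_{i_2}(t) + \sum_{i_1=2}^{k_1} \sum_{i_2=2}^{k_2} \vartheta_{i_1,i_2} f^{s_1}_{i_1}(t) f^{s_2}_{i_2}(t), \tag{7} $$

$$ \Phi_{t,j} = \Phi_{0,j} + \sum_{i_1=2}^{k_1} \Phi_{i_1,j} f^{s_1}_{i_1}(t) + \sum_{i_2=2}^{k_2} \Phi_{i_2,j} f^{s_2}_{i_2}(t) + \sum_{i_1=2}^{k_1} \sum_{i_2=2}^{k_2} \Phi_{i_1,i_2,j} f^{s_1}_{i_1}(t) f^{s_2}_{i_2}(t), \tag{8} $$

where $\{\eta_t\}$ is i.i.d., $E(\eta_t) = 0$ and $\mathrm{Var}(\eta_t) = 1$. Moreover, $\vartheta_0$ is an $(M \cdot 5) \times 1$ intercept vector
and $\vartheta_{i_1}$, $\vartheta_{i_2}$ and $\vartheta_{i_1,i_2}$ are $(M \cdot 5) \times 1$ periodic coefficient vectors. $\Phi_{0,j}$, $\Phi_{i_1,j}$, $\Phi_{i_2,j}$ and
$\Phi_{i_1,i_2,j}$ are $(M \cdot 5) \times (M \cdot 5)$ parameter matrices of autoregressive and periodic autoregressive
parameters for lag $j \in \mathbb{N}$. The periodic B-spline functions $f^{s_1}_{i_1}(t)$ and $f^{s_2}_{i_2}(t)$ are similar to
Section 2.1. Finally, $\sigma_t$ and $\eta_t$ are $(M \cdot 5) \times 1$ vectors, where $\sigma_t$ follows a TARCH process (see
Glosten et al., 1993) given by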

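$$ \sigma_t = \alpha_0 + \sum_{i_1=2}^{k_1} \alpha_{i_1} f^{s_1}_{i_1}(t) + \sum_{h=1}^{P} \zeta_{t,h} \epsilon^{+}_{t-h} + \sum_{l=1}^{Q} \psi_{t,l} \epsilon^{-}_{t-l}, \tag{9} $$

$$ \zeta_{t,h} = \zeta_{0,h} + \sum_{i_1=2}^{k_1} \zeta_{i_1,h} f^{s_1}_{i_1}(t), \tag{10} $$

$$ \psi_{t,l} = \psi_{0,l} + \sum_{i_1=2}^{k_1} \psi_{i_1,l} f^{s_1}_{i_1}(t), \tag{11} $$

where $\alpha_0$ and $\alpha_{i_1}$ are $(M \cdot 5) \times 1$ parameter vectors. Furthermore, $\zeta_{0,h}$, $\psi_{0,l}$, $\zeta_{i_1,h}$ and $\psi_{i_1,l}$ are
autoregressive and periodic autoregressive parameter matrices ($(M \cdot 5) \times (M \cdot 5)$) within the
variance. The $(M \cdot 5) \times 1$ vectors of indicator functions $\epsilon^{+}_{t-1}$ and $\epsilon^{-}_{t-1}$ are given elementwise by

$$ \epsilon^{+}_{\mathcal{N},m,t-1} = \begin{cases} 0, & \epsilon_{\mathcal{N},m,t-1} < 0 \\ \epsilon_{\mathcal{N},m,t-1}, & \epsilon_{\mathcal{N},m,t-1} \ge 0 \end{cases}, \qquad \epsilon^{-}_{\mathcal{N},m,t-1} = \begin{cases} |\epsilon_{\mathcal{N},m,t-1}|, & \epsilon_{\mathcal{N},m,t-1} < 0 \\ 0, & \epsilon_{\mathcal{N},m,t-1} \ge 0 \end{cases} \tag{12} $$

where $\mathcal{N} \in \{1, \dots, 5\}$ represents the $\mathcal{N}$th regressor. These vectors provide one way to model
a TARCH process for a station $m$. In this way we double the parameter space, but are able
to differentiate between negative and positive shocks. Overall, the number of parameters
amounts to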


2 ■ (T + 2 ((/ci — 1) T {k2 — 1) T (^1^2 — 7) T 7) T (^i — 1) + H + Q + 1 + ‘^ki ), 
which may be much larger than the number of observations. Thus, regularization and shrinkage
are necessary.

? use an iteratively re-weighted least squares approach to develop accurate load forecasts.
More recently, Ziel et al. (2015) implement an iteratively re-weighted lasso method to predict the
electricity price. We consider the lasso and the elastic net as estimation methods,
as they are designed to reduce the parameter space in the estimation stage. Ren and Zhang
(2010) apply the lasso method in the context of vector autoregressive models. The weighted lasso
estimation of the parameter vector $\theta_{\mathcal{N},m}$ for both the SVARX model (5) and the TARCHX
model (9) is given by

$$ \hat{\theta}_{\mathcal{N},m} = \arg\min_{\theta_{\mathcal{N},m}} \sum_{t=1}^{T} \left( Y_{\mathcal{N},m,t} - \omega_t X_{\mathcal{N},m,t} \theta_{\mathcal{N},m} \right)^2 + \lambda_{\mathcal{N},m,T} \sum_{j=1}^{p_j} |\theta_j|, \tag{13} $$

where $\omega_t$ are the elements of the weight vector $\omega = (\omega_1, \dots, \omega_T)$, $p_j$ is the number of elements of the parameter
vector and $\lambda_{\mathcal{N},m,T} > 0$ is the tuning parameter. The second estimation method is the
elastic net (see ?) which is given by

$$ \hat{\theta}_{\mathcal{N},m} = \arg\min_{\theta_{\mathcal{N},m}} \sum_{t=1}^{T} \left( Y_{\mathcal{N},m,t} - \omega_t X_{\mathcal{N},m,t} \theta_{\mathcal{N},m} \right)^2 + \lambda_{\mathcal{N},m,T} \left( \alpha \sum_{j=1}^{p_j} |\theta_j| + (1 - \alpha) \sum_{j=1}^{p_j} \theta_j^2 \right), \tag{14} $$

where $\alpha \in [0, 1]$ provides an additional tuning parameter. In the case of $\alpha = 1$, we obtain the
lasso case and if $\alpha = 0$, we obtain the ridge regression. Finally, if $\alpha \in (0, 1)$, the penalties
represent the elastic net.
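
The role of the two tuning parameters can be illustrated with a small coordinate descent solver. This is a sketch under stated assumptions, not the authors' implementation: it minimizes the weighted residual sum of squares plus `lam * (alpha * sum|theta_j| + (1 - alpha) * sum theta_j^2)`, and it assumes that the weights scale each observation row, response and regressors alike, which is the usual weighted least squares convention. All names are illustrative.

```python
from typing import List, Sequence


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value towards zero by threshold; return 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_elastic_net(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    weights: Sequence[float],
    lam: float,
    alpha: float,
    n_sweeps: int = 200,
) -> List[float]:
    """Cyclic coordinate descent for the weighted elastic net objective described above."""
    n, p = len(X), len(X[0])
    # Assumption: each observation row is scaled by its weight.
    Xw = [[weights[t] * X[t][j] for j in range(p)] for t in range(n)]
    yw = [weights[t] * y[t] for t in range(n)]
    col_sq = [sum(Xw[t][j] ** 2 for t in range(n)) for j in range(p)]
    theta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Inner product of column j with the residual that excludes coordinate j.
            rho = 0.0
            for t in range(n):
                fit_without_j = sum(Xw[t][k] * theta[k] for k in range(p) if k != j)
                rho += Xw[t][j] * (yw[t] - fit_without_j)
            denom = col_sq[j] + lam * (1.0 - alpha)
            theta[j] = soft_threshold(rho, lam * alpha / 2.0) / denom if denom > 0.0 else 0.0
    return theta


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [2.0, -1.0, 1.0]
    w = [1.0, 1.0, 1.0]
    print(weighted_elastic_net(X, y, w, lam=0.0, alpha=1.0))   # no penalty
    print(weighted_elastic_net(X, y, w, lam=10.0, alpha=1.0))  # strong lasso penalty
```

With `alpha = 1` the update reduces to the lasso soft-thresholding step, while `alpha = 0` gives a pure ridge update without exact zeros. A fixed number of sweeps replaces a convergence check to keep the sketch short.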

The estimation is done in two steps. We start with the estimation of the mean model and
afterwards, we re-weight the mean model using $\sigma_t$. Figure 7 provides the estimation scheme for
the iteratively re-weighted lasso method (see Efron et al., 2004, Ziel et al., 2015). During the
algorithm, we use the Akaike information criterion (AIC) for the model selection. The procedure
is repeated until convergence is achieved. If $K = 1$, we obtain the homoscedastic estimates
without re-weighting the lasso. The advantage of this approach is the fast computing time, and
further extensions are still possible. Clearly, we have to choose the specific lags of the AR and
the TARCH part, which we discuss in the next section.
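
The general shape of such an iteratively re-weighted procedure can be sketched as a loop that alternates between a mean fit and a scale fit. The sketch below is not the scheme of Figure 7. It assumes that the weights are the reciprocals of the estimated conditional scale, and it uses two deliberately simple stand-ins, a one-regressor weighted least squares fit in place of the lasso and a moving average of absolute residuals in place of the TARCHX step, so that the example stays self-contained. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def weighted_ls_through_origin(X: Matrix, y: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Stand-in for the weighted lasso step: one regressor, no intercept."""
    num = sum((w * row[0]) * (w * v) for row, v, w in zip(X, y, weights))
    den = sum((w * row[0]) ** 2 for row, w in zip(X, weights))
    return [num / den]


def moving_average_scale(abs_residuals: Sequence[float], window: int = 3, floor: float = 1e-3) -> List[float]:
    """Stand-in for the variance model: centred moving average of |residuals|."""
    n, half = len(abs_residuals), window // 2
    scale = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        scale.append(max(floor, sum(abs_residuals[lo:hi]) / (hi - lo)))
    return scale


def iteratively_reweighted_fit(
    X: Matrix,
    y: Sequence[float],
    fit_mean: Callable[[Matrix, Sequence[float], Sequence[float]], List[float]],
    fit_scale: Callable[[Sequence[float]], List[float]],
    n_iter: int,
) -> Tuple[List[float], List[float]]:
    """Alternate between a weighted mean fit and a scale fit of the absolute residuals."""
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    weights = [1.0] * len(y)
    coef = fit_mean(X, y, weights)  # first pass: homoscedastic fit with unit weights
    for _ in range(n_iter - 1):
        residuals = [v - sum(c * x for c, x in zip(coef, row)) for row, v in zip(X, y)]
        scale = fit_scale([abs(r) for r in residuals])
        weights = [1.0 / s for s in scale]  # assumption: weights are inverse scales
        coef = fit_mean(X, y, weights)
    return coef, weights


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0], [4.0]]
    y = [1.0, 2.1, 2.9, 4.5]
    print(iteratively_reweighted_fit(X, y, weighted_ls_through_origin, moving_average_scale, n_iter=3))
```

With `n_iter = 1` the loop body is never entered and the unweighted estimate is returned, which mirrors the homoscedastic special case mentioned in the text.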

From the modeling perspective, we expect that the iteratively re-weighted lasso or elastic net 
method should be superior compared to the ML/QML approaches, because we are able to 
handle a huge parameter space with a variable selection and optimization algorithm. Moreover, 
we do not need a specific distributional assumption. 


3 Model fitting results 

The ARFIMA-APARCH process presented in Equations (2) and (3) is our extensive benchmark
model. Here, this model is estimated by a QML approach under normally distributed residuals.
Given the complexity of the ARFIMA-APARCH model, we choose a sparse parametrization. The
in-sample results are comparable to previous findings by Ambach and Schmid (2014) and Taylor
et al. (2009).

Figure 7. Estimation scheme.

The novel wind speed approach is described in Equations (5) and (9). The estimation of the
model is done by an iteratively re-weighted method. We distinguish between two estimation
methods, the lasso method (13) and the elastic net (14). An advantage of these methods is
that in contrast to the QML estimation of the ARFIMA-APARCH benchmark model, we do
not need a distributional assumption for the SVARX-TARCHX model.

The wind speed time series shows strong autocorrelation with a pronounced diurnal
structure. Therefore, we choose an autocorrelation structure of about two days, which
is $J = 288 + 1$. The TARCH conditional variance structure of our model includes a periodicity
of two days, i.e. $Q = P = 288 + 1$ lags. The covariance structure between each station and each
response variable is also modelled by the vector autoregressive lags within the mean and variance
part, but some interactions may be set to zero during the lasso/elastic net iterations.

Figure 8 depicts the autocorrelation functions of stations Müncheberg and Lindenberg for the
standardized residuals of the wind speed. In each 2x2 block, the main diagonal (top-left and
bottom-right) shows the ACF of stations 2 and 5. The off-diagonal panels (top-right and
bottom-left) show cross-correlations: top-right is the cross-correlation of station 2 with
station 5, bottom-left that of station 5 with station 2. Very few of the correlations (in all
cases, fewer than 5 %) are significant.

Figure 8. Autocorrelation function (ACF) of $\{\epsilon_t\}$ for the new model with lasso method (first and second column)
and ACF of the elastic net (third and fourth column).

Figure 9 depicts the autocorrelation functions of stations Müncheberg and Lindenberg for the
absolute standardized residuals of the wind speed series $\{|\epsilon_t|\}$ and both estimation methods.
The ACFs in Figures 8 and 9 show only a minor presence of remaining autocorrelation, as there
are very few single spikes outside the confidence bands.

In addition to the ACF plots, we calculate the Ljung-Box test for $\{\epsilon_t\}$ and $\{|\epsilon_t|\}$. At a
significance level of 5%, we cannot reject the null hypothesis of independence. Overall, the
autocorrelation analysis suggests an excellent model fit, especially because almost
no periodic structure remains in the residuals. Therefore, we expect proper forecasting results.
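
For reference, the Ljung-Box statistic for a residual series can be computed as in the following sketch. It implements the standard definition $Q = n(n+2) \sum_{k=1}^{h} \hat{\rho}_k^2 / (n-k)$. The number of lags used in the paper is not stated here, so `max_lag` is left as a free parameter, and the function names are illustrative.

```python
from typing import Sequence


def autocorrelation(x: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_statistic(x: Sequence[float], max_lag: int) -> float:
    """Ljung-Box Q statistic using lags 1, ..., max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


if __name__ == "__main__":
    # A strongly alternating series is clearly autocorrelated.
    series = [1.0, -1.0] * 4
    print(ljung_box_statistic(series, max_lag=1))
```

Under the null hypothesis, Q is compared with a chi-squared quantile whose degrees of freedom depend on the number of lags (and, for residuals of a fitted model, on the number of estimated parameters).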
Figure 9. Autocorrelation function (ACF) of $\{|\epsilon_t|\}$ for the new model with lasso method (first and second
column) and ACF of the elastic net (third and fourth column).


4 Out-of-sample forecasting results 

In this section, we evaluate our model and the benchmark approaches according to their prediction
performance. Common criteria are the root mean square error (RMSE), the mean absolute
error (MAE) and the probability integral transform (PIT) histogram. PIT histograms allow
for an insight into calibration and sharpness of forecasts, as ? point out. The out-of-sample
forecasts are performed for the time frame from July 2011 to December 2011. We
select $N = 1000$ points in time $(\tau(i))_{i \in \{1, \dots, N\}}$ in the out-of-sample period at random.

Forecasts are calculated at horizons of up to a maximum of one day (i.e. 24 hours = 144 steps),
as 24 hours conclude the short- to medium-term forecasting horizon of wind speed, as, e.g., ?
and ? point out. RMSE and MAE are calculated by


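$$ \mathrm{RMSE}_o = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( \hat{Y}_{\mathcal{N},m,\tau(i)+o} - Y_{\mathcal{N},m,\tau(i)+o} \right)^2 }, \tag{15} $$

$$ \mathrm{MAE}_o = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{Y}_{\mathcal{N},m,\tau(i)+o} - Y_{\mathcal{N},m,\tau(i)+o} \right|, \tag{16} $$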
where $\hat{Y}_{\mathcal{N},m,\tau(i)+o}$ is the $o$-step forecast of wind speed and $Y_{\mathcal{N},m,\tau(i)+o}$ is the actual observation.
Figures 10 and 11 present the out-of-sample aggregated forecasting error results. Overall,
our novel model outperforms the benchmark models. In most cases, even the competitive
ARFIMA and the extensive VAR($p$) model are outperformed substantially. Additionally, we
observe that the naive model is clearly outperformed in each case. For a few short forecasting
steps, the highly persistent VAR($p$) model returns slightly lower errors. However, this is
observed only for station Lindenberg. As forecasting horizons increase beyond one hour,
our novel model is the overall winner. We therefore conclude that the iteratively
re-weighted lasso and elastic net approaches outperform the benchmark methods. Looking at
MAE, the elastic net outperforms the lasso. RMSE, however, shows no clear difference between lasso and
elastic net.
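
The two error measures can be computed per forecasting horizon as in the following sketch. The data layout is an assumption: `forecast_paths[i][o - 1]` holds the forecast issued at the $i$-th randomly chosen origin for horizon $o$, and `observed_paths` holds the matching observations. Function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def rmse(forecasts: Sequence[float], observations: Sequence[float]) -> float:
    """Root mean square error over N forecast/observation pairs."""
    n = len(forecasts)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n)


def mae(forecasts: Sequence[float], observations: Sequence[float]) -> float:
    """Mean absolute error over N forecast/observation pairs."""
    n = len(forecasts)
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / n


def errors_by_horizon(
    forecast_paths: Sequence[Sequence[float]],
    observed_paths: Sequence[Sequence[float]],
) -> List[Tuple[float, float]]:
    """Return (RMSE, MAE) for every horizon, aggregating over all forecast origins."""
    n_horizons = len(forecast_paths[0])
    results = []
    for h in range(n_horizons):
        f = [path[h] for path in forecast_paths]
        o = [path[h] for path in observed_paths]
        results.append((rmse(f, o), mae(f, o)))
    return results


if __name__ == "__main__":
    forecasts = [[1.0, 2.0], [3.0, 4.0]]   # two origins, two horizons (made-up values)
    observed = [[1.0, 4.0], [5.0, 4.0]]
    print(errors_by_horizon(forecasts, observed))
```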

Figure 10. RMSE (first column) and MAE (second column) by forecasting horizon for station Müncheberg for
all models.

Figure 11. RMSE (first column) and MAE (second column) by forecasting horizon for station Lindenberg for
all models.

Figures 12 and 13 show the PIT histograms for shorter-term forecasting horizons of one and
six hours and the iteratively re-weighted lasso and elastic net forecasting technique. As
expected, the calibration and sharpness of our forecasts are stable as long as
forecasting horizons remain relatively short. However, we observe declining calibration and
sharpness for forecasting horizons beyond six hours. For medium-term forecasts,
we observe a PIT histogram which shows a certain degree of over-dispersion (omitted here to
conserve space, available upon request).
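
A PIT value is the predictive cumulative distribution function evaluated at the realized observation. The sketch below assumes a Gaussian predictive distribution purely for illustration; the paper does not state the predictive distribution here, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_pit(observation: float, mean: float, std: float) -> float:
    """PIT value under an assumed Gaussian predictive distribution N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((observation - mean) / (std * math.sqrt(2.0))))


def pit_histogram(pit_values: Sequence[float], n_bins: int = 10) -> List[int]:
    """Count PIT values in n_bins equally wide bins on [0, 1]."""
    counts = [0] * n_bins
    for u in pit_values:
        index = min(int(u * n_bins), n_bins - 1)  # put u == 1.0 into the last bin
        counts[index] += 1
    return counts


if __name__ == "__main__":
    observations = [4.2, 5.1, 3.3, 6.0]          # made-up wind speeds in m/s
    means = [4.0, 5.0, 4.0, 5.0]                 # made-up predictive means
    stds = [1.0, 1.0, 1.0, 1.0]                  # made-up predictive standard deviations
    pits = [gaussian_pit(y, m, s) for y, m, s in zip(observations, means, stds)]
    print(pit_histogram(pits, n_bins=5))
```

A roughly flat histogram indicates well calibrated forecasts, whereas systematic deviations from uniformity point to a predictive distribution that is too narrow, too wide or biased.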
Figure 12. Probability integral transform (PIT) histogram for lasso (first column) and elastic-net (second
column) for the forecasting horizon of one hour for station Müncheberg (first row) and Lindenberg
(second row).


5 Conclusion 

This article presents a new model for wind speed forecasting. The introduced periodic
SVARX-TARCHX model is used to predict the wind speed, wind direction, air pressure and temperature.
The model does so by means of non-trigonometric periodic B-spline functions. Moreover, we
exploit the spatial distribution of the measurement stations and take conditional
heteroscedasticity into account.

As the parameters of the sophisticated ARFIMA-APARCH benchmark are estimated by
numerical (quasi) maximum likelihood estimation, it usually takes more than one hour of
computing time to obtain one out-of-sample forecast. In contrast, our SVARX-TARCHX
model can be estimated by modern shrinkage methods like lasso or elastic net, which takes only
about six to eight minutes per forecast. Due to our iterative re-weighting scheme, the algorithm
is still able to capture the heteroscedastic nature of the observed data.

Overall, our proposed model is able to outperform the naive benchmark as well as the VAR.
Even the more advanced ARFIMA-APARCH model is outperformed by a significant degree.
Modeling periodicity by B-splines instead of Fourier series brings additional flexibility and
numerical performance. Results show that the new model captures periodicity quite well. The
new model provides a flexible framework with a lot of adjustment options, which makes it a
universal tool for many kinds of settings in wind speed forecasting research.
Figure 13. Probability integral transform (PIT) histogram for lasso (first column) and elastic-net (second
column) for the forecasting horizon of four hours for station Müncheberg (first row) and Lindenberg
(second row).